Construct Non-Hierarchical P/NBD Model for Online Retail Transaction Data
In this workbook we construct non-hierarchical P/NBD models for the synthetic data with the longer timeframe.
1 Load and Construct Datasets
We start by fitting the P/NBD model to our synthetic datasets before we try to model real-life data.
Show code
use_fit_start_date <- as.Date("2009-12-01")
use_fit_end_date <- as.Date("2010-12-01")
use_valid_start_date <- as.Date("2010-12-01")
use_valid_end_date <- as.Date("2012-12-10")
1.1 Load Online Retail Transaction Data
We now want to load the online retail transaction data.
Show code
customer_cohortdata_tbl <- read_rds("data/retail_data_cohort_tbl.rds")
customer_cohortdata_tbl |> glimpse()
Rows: 5,852
Columns: 5
$ customer_id <chr> "12346", "12347", "12348", "12349", "12350", "12351", …
$ cohort_qtr <chr> "2010 Q1", "2010 Q4", "2010 Q3", "2010 Q2", "2011 Q1",…
$ cohort_ym <chr> "2010 03", "2010 10", "2010 09", "2010 04", "2011 02",…
$ first_tnx_date <date> 2010-03-02, 2010-10-31, 2010-09-27, 2010-04-29, 2011-…
$ total_tnx_count <int> 3, 8, 5, 3, 1, 1, 9, 2, 1, 2, 6, 2, 5, 10, 6, 4, 10, 2…
Show code
customer_transactions_tbl <- read_rds("data/retail_data_transactions_tbl.rds")
customer_transactions_tbl |> glimpse()
Rows: 53,711
Columns: 4
$ tnx_timestamp <dttm> 2009-12-01 07:45:00, 2009-12-01 07:45:59, 2009-12-01 09…
$ invoice_id <chr> "489434", "489435", "489436", "489437", "489438", "48943…
$ customer_id <chr> "13085", "13085", "13078", "15362", "18102", "12682", "1…
$ tnx_amount <dbl> 505.30, 145.80, 630.33, 310.75, 2286.24, 426.30, 50.40, …
We reproduce the visualisation of the transaction times used in previous workbooks.
Show code
plot_tbl <- customer_transactions_tbl |>
group_nest(customer_id, .key = "cust_data") |>
filter(map_int(cust_data, nrow) > 3) |>
slice_sample(n = 30) |>
unnest(cust_data)
ggplot(plot_tbl, aes(x = tnx_timestamp, y = customer_id)) +
geom_line() +
geom_point() +
labs(
x = "Date",
y = "Customer ID",
title = "Visualisation of Customer Transaction Times"
) +
theme(axis.text.y = element_text(size = 10))
1.2 Construct Datasets
Having loaded the synthetic data we need to construct a number of datasets of derived values.
Show code
customer_summarystats_tbl <- customer_transactions_tbl |>
drop_na(customer_id) |>
calculate_transaction_cbs_data(last_date = use_fit_end_date |> as.POSIXct())
customer_summarystats_tbl |> glimpse()
Rows: 4,336
Columns: 6
$ customer_id <chr> "12346", "12347", "12348", "12349", "12351", "12352", "…
$ first_tnx_date <dttm> 2009-12-14 08:34:00, 2010-10-31 14:19:59, 2010-09-27 1…
$ last_tnx_date <dttm> 2010-10-04 16:32:59, 2010-10-31 14:19:59, 2010-09-27 1…
$ x <dbl> 14, 0, 0, 3, 0, 1, 0, 0, 2, 1, 2, 7, 5, 2, 0, 0, 0, 2, …
$ t_x <dbl> 42.04751984, 0.00000000, 0.00000000, 46.83075397, 0.000…
$ T_cal <dbl> 50.2347222, 4.3432540, 9.1965278, 51.6379960, 0.1941468…
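These summary statistics follow the usual CBS ("customer-by-sufficient-statistic") layout: x is the number of repeat transactions, t_x is the recency (weeks from the first to the most recent transaction), and T_cal is the observation length (weeks from the first transaction to the end of the calibration window). We do not show calculate_transaction_cbs_data itself, so as a sketch of what it computes, here is a base-R calculation using the first and last transaction times of the first customer shown above (with a toy middle transaction added):

```r
# Toy transaction timestamps (first and last taken from the glimpse above,
# middle timestamp invented for illustration)
tnx_times <- as.POSIXct(
  c("2009-12-14 08:34:00", "2010-02-01 10:00:00", "2010-10-04 16:32:59"),
  tz = "UTC"
)
fit_end <- as.POSIXct("2010-12-01 00:00:00", tz = "UTC")

first_tnx <- min(tnx_times)
last_tnx  <- max(tnx_times)

x     <- length(tnx_times) - 1                                       # repeat transactions
t_x   <- as.numeric(difftime(last_tnx, first_tnx, units = "weeks"))  # recency
T_cal <- as.numeric(difftime(fit_end, first_tnx, units = "weeks"))   # observation length
```

The resulting t_x and T_cal match the first row of the glimpse above (42.0475 and 50.2347); x does not, since the toy vector contains only three of that customer's transactions.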
As before, we construct a number of subsets of the data for use later on in the modelling.
Show code
customer_fit_stats_tbl <- customer_summarystats_tbl
customer_fit_stats_tbl |> glimpse()
Rows: 4,336
Columns: 6
$ customer_id <chr> "12346", "12347", "12348", "12349", "12351", "12352", "…
$ first_tnx_date <dttm> 2009-12-14 08:34:00, 2010-10-31 14:19:59, 2010-09-27 1…
$ last_tnx_date <dttm> 2010-10-04 16:32:59, 2010-10-31 14:19:59, 2010-09-27 1…
$ x <dbl> 14, 0, 0, 3, 0, 1, 0, 0, 2, 1, 2, 7, 5, 2, 0, 0, 0, 2, …
$ t_x <dbl> 42.04751984, 0.00000000, 0.00000000, 46.83075397, 0.000…
$ T_cal <dbl> 50.2347222, 4.3432540, 9.1965278, 51.6379960, 0.1941468…
Show code
customer_valid_stats_tbl <- customer_transactions_tbl |>
drop_na(customer_id) |>
filter(
tnx_timestamp > (use_valid_start_date |> as.POSIXct())
) |>
summarise(
tnx_count = n(),
tnx_last_interval = difftime(
max(tnx_timestamp),
use_valid_start_date,
units = "weeks"
) |>
as.numeric(),
.by = customer_id
)
customer_valid_stats_tbl |> glimpse()
Rows: 4,372
Columns: 3
$ customer_id <chr> "17850", "13047", "12583", "13748", "15100", "15291"…
$ tnx_count <int> 35, 18, 18, 5, 6, 20, 27, 15, 118, 86, 8, 1, 3, 76, …
$ tnx_last_interval <dbl> 10.22996032, 48.92956349, 53.04831349, 39.77232143, …
Show code
obs_fitdata_tbl <- customer_fit_stats_tbl |>
rename(tnx_count = x)
### We need to add all the zero count customers into the valid data
obs_validdata_tbl <- customer_fit_stats_tbl |>
anti_join(customer_valid_stats_tbl, by = "customer_id") |>
transmute(customer_id, tnx_count = 0) |>
bind_rows(customer_valid_stats_tbl) |>
arrange(customer_id)
We then write this data to disk.
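The augmentation step above matters because customers with no transactions in the validation window would otherwise be silently dropped from the validation data. A base-R sketch of the same anti-join-and-bind logic, on toy data with hypothetical customer IDs:

```r
fit_ids   <- c("A", "B", "C", "D")   # all customers present in the fit data
valid_tbl <- data.frame(
  customer_id = c("B", "D"),         # only these transacted in the validation window
  tnx_count   = c(3L, 1L)
)

# Customers in the fit data but absent from the validation data get a zero count
zero_tbl <- data.frame(
  customer_id = setdiff(fit_ids, valid_tbl$customer_id),
  tnx_count   = 0L
)

obs_valid <- rbind(zero_tbl, valid_tbl)
obs_valid <- obs_valid[order(obs_valid$customer_id), ]
```

Every fit-data customer now appears exactly once in the validation table, with a count of zero where appropriate.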
Show code
#| echo: TRUE
obs_fitdata_tbl |> write_rds("data/onlineretail_obs_fitdata_tbl.rds")
obs_validdata_tbl |> write_rds("data/onlineretail_obs_validdata_tbl.rds")
2 Fit First P/NBD Model
We now construct our Stan model and prepare to fit it with our synthetic dataset.
Before we start on that, we set a few parameters for the workbook to organise our Stan code.
Show code
stan_modeldir <- "stan_models"
stan_codedir <- "stan_code"
We also want to set a number of overall parameters for this workbook.
To construct the fit data, we use the summary statistics for all customers observed in the fitting period. We also need to calculate the summary statistics for the validation period.
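For reference, the likelihood the Stan model fits is the standard individual-level Pareto/NBD form (we have not reproduced pnbd_fixed.stan here, so we state the conventional model rather than transcribe the code). For a customer with \(x\) repeat transactions, recency \(t_x\), observation length \(T\), transaction rate \(\lambda\), and dropout rate \(\mu\):

\[
L(\lambda, \mu \mid x, t_x, T)
  = \frac{\lambda^x \mu}{\lambda + \mu} \, e^{-(\lambda + \mu)\, t_x}
  + \frac{\lambda^{x+1}}{\lambda + \mu} \, e^{-(\lambda + \mu)\, T}
\]

The two terms arise from marginalising over the unobserved lifetime: the customer either dropped out at some point after their last observed transaction at \(t_x\), or is still active at \(T\).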
2.1 Compile and Fit Stan Model
We now compile this model using CmdStanR.
Show code
pnbd_fixed_stanmodel <- cmdstan_model(
"stan_code/pnbd_fixed.stan",
include_paths = stan_codedir,
pedantic = TRUE,
dir = stan_modeldir
)
We then use this compiled model with our data to produce a fit of the data.
Show code
stan_modelname <- "pnbd_onlineretail_fixed"
stanfit_prefix <- str_c("fit_", stan_modelname)
stanfit_seed <- 4201

stan_data_lst <- customer_fit_stats_tbl |>
  select(customer_id, x, t_x, T_cal) |>
  compose_data(
    lambda_mn = 0.25,
    lambda_cv = 1.00,
    mu_mn     = 0.10,
    mu_cv     = 1.00
  )

pnbd_onlineretail_fixed1_stanfit <- pnbd_fixed_stanmodel$sample(
  data            = stan_data_lst,
  chains          = 4,
  iter_warmup     = 500,
  iter_sampling   = 500,
  seed            = stanfit_seed,
  save_warmup     = TRUE,
  output_dir      = stan_modeldir,
  output_basename = stanfit_prefix
)
Running MCMC with 4 chains, at most 8 in parallel...
Chain 1 finished in 91.5 seconds.
Chain 2 finished in 93.9 seconds.
Chain 3 finished in 91.9 seconds.
Chain 4 finished in 90.6 seconds.
All 4 chains finished successfully.
Mean chain execution time: 92.0 seconds.
Total execution time: 94.3 seconds.
Show code
pnbd_onlineretail_fixed1_stanfit$summary()
# A tibble: 13,009 × 10
variable mean median sd mad q5 q95 rhat ess_bulk
<chr> <num> <num> <num> <num> <num> <num> <num> <num>
1 lp__ -7.51e+4 -7.51e+4 74.4 75.5 -7.52e+4 -7.50e+4 1.00 721.
2 lambda[1] 2.92e-1 2.85e-1 0.0803 0.0768 1.78e-1 4.37e-1 1.00 1907.
3 lambda[2] 1.44e-1 9.36e-2 0.153 0.0987 8.00e-3 4.47e-1 1.00 2059.
4 lambda[3] 1.28e-1 7.64e-2 0.154 0.0814 4.76e-3 4.13e-1 1.00 1479.
5 lambda[4] 7.19e-2 6.65e-2 0.0361 0.0337 2.44e-2 1.39e-1 1.00 1795.
6 lambda[5] 2.45e-1 1.79e-1 0.230 0.180 1.44e-2 6.87e-1 1.00 1485.
7 lambda[6] 2.98e-1 2.45e-1 0.216 0.178 5.40e-2 7.19e-1 1.00 1754.
8 lambda[7] 1.47e-1 9.90e-2 0.157 0.101 7.20e-3 4.44e-1 1.00 1567.
9 lambda[8] 1.38e-1 8.11e-2 0.156 0.0925 7.14e-3 4.53e-1 1.00 1940.
10 lambda[9] 2.70e-1 2.41e-1 0.154 0.138 7.71e-2 5.67e-1 1.00 1668.
# ℹ 12,999 more rows
# ℹ 1 more variable: ess_tail <num>
We have some basic HMC-based validity statistics we can check.
Show code
pnbd_onlineretail_fixed1_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed-1.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed-2.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed-3.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed-4.csvWarning: non-fatal error reading adaptation data
Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.
Checking sampler transitions for divergences.
No divergent transitions found.
Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.
Effective sample size satisfactory.
Split R-hat values satisfactory all parameters.
Processing complete, no problems detected.
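As a reminder of what the split R-hat check above is doing, here is a minimal base-R sketch of the classic split potential scale reduction factor. This is the non-rank-normalized version; the actual cmdstan diagnostic additionally rank-normalizes the draws before computing it.

```r
# draws: iterations x chains matrix of posterior draws for one parameter
split_rhat <- function(draws) {
  half <- nrow(draws) %/% 2
  # Split each chain in half so that within-chain trends also inflate R-hat
  split_mat <- cbind(draws[1:half, , drop = FALSE],
                     draws[(half + 1):(2 * half), , drop = FALSE])
  n <- nrow(split_mat)
  chain_means <- colMeans(split_mat)
  W <- mean(apply(split_mat, 2, var))   # mean within-chain variance
  B <- n * var(chain_means)             # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(42)
good <- matrix(rnorm(4000), ncol = 4)   # four well-mixed chains
bad  <- good
bad[, 4] <- bad[, 4] + 5                # one chain stuck in a different region
```

split_rhat(good) lands very close to 1, while split_rhat(bad) is far above the usual 1.01 warning threshold.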
2.2 Visual Diagnostics of the Sample Validity
Now that we have a sample from the posterior distribution we need to create a few different visualisations of the diagnostics.
Show code
parameter_subset <- c(
"lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
"mu[1]", "mu[2]", "mu[3]", "mu[4]"
)
pnbd_onlineretail_fixed1_stanfit$draws(inc_warmup = FALSE) |>
mcmc_trace(pars = parameter_subset) +
expand_limits(y = 0) +
labs(
x = "Iteration",
y = "Value",
title = "Traceplot of Sample of Lambda and Mu Values"
) +
theme(axis.text.x = element_text(size = 10))
We also check \(N_{eff}\) as a quick diagnostic of the fit.
Show code
pnbd_onlineretail_fixed1_stanfit |>
neff_ratio(pars = c("lambda", "mu")) |>
as.numeric() |>
mcmc_neff() +
ggtitle("Plot of Parameter Effective Sample Sizes")
2.3 Assess the Model
As we intend to run the same logic to assess each of our models, we have combined all this logic into a single function run_model_assessment, to run the simulations and combine the datasets.
2.3.1 Check In-Sample Data Validation
We first check the model against the in-sample data.
Show code
assess_data_lst <- run_model_assessment(
model_stanfit = pnbd_onlineretail_fixed1_stanfit,
insample_tbl = customer_fit_stats_tbl,
outsample_tbl = customer_valid_stats_tbl,
fit_label = "pnbd_onlineretail_fixed",
fit_end_dttm = use_fit_end_date |> as.POSIXct(),
valid_start_dttm = use_valid_start_date |> as.POSIXct(),
valid_end_dttm = use_valid_end_date |> as.POSIXct(),
sim_seed = 10
)
insample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_fitdata_tbl,
simdata_tbl = assess_data_lst$model_fit_simstats_tbl
)
insample_plots_lst$multi_plot |> print()
Show code
insample_plots_lst$total_plot |> print()
Show code
insample_plots_lst$quant_plot |> print()
This fit looks reasonable and appears to capture most of the aspects of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but it at least confirms that our model is valid.
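One quantity worth keeping in mind when reading these assessments: conditional on an individual's \(\lambda\) and \(\mu\), the Pareto/NBD probability that a customer is still active at the end of the calibration window has a closed form. This is the standard textbook result, not a transcription of our Stan code:

```r
# P(customer still active at T_cal | lambda, mu, t_x, T_cal) for the Pareto/NBD
p_alive <- function(lambda, mu, t_x, T_cal) {
  1 / (1 + mu / (lambda + mu) * (exp((lambda + mu) * (T_cal - t_x)) - 1))
}
```

A customer whose last transaction falls at the end of the window (t_x close to T_cal) is almost certainly still active; a long purchase gap drives the probability towards zero.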
2.3.2 Check Out-of-Sample Data Validation
We now repeat for the out-of-sample data.
Show code
outsample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_validdata_tbl,
simdata_tbl = assess_data_lst$model_valid_simstats_tbl
)
outsample_plots_lst$multi_plot |> print()Show code
outsample_plots_lst$total_plot |> print()Show code
outsample_plots_lst$quant_plot |> print()As for our short time frame data, overall our model is working well.
2.4 Write Assess Data to Disk
We now store this assessment data to disk.
Show code
assess_data_lst |> write_rds("data/pnbd_onlineretail_fixed1_assess_data_lst.rds")
3 Fit Alternate Prior Model
We want to try an alternate prior model with a smaller coefficient of variation to see what impact it has on our procedures.
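The model is parameterised by a prior mean and coefficient of variation for each rate. Assuming, as the parameter names suggest, that these describe Gamma priors on \(\lambda\) and \(\mu\), the implied shape and rate follow from \(\text{mean} = \alpha/\beta\) and \(\text{CV} = 1/\sqrt{\alpha}\). A quick sketch of that conversion (the actual mapping lives in pnbd_fixed.stan, which is not shown here):

```r
# Convert a (mean, cv) prior specification to Gamma shape/rate.
# Assumes Gamma priors, inferred from the parameter names.
gamma_from_mean_cv <- function(mn, cv) {
  shape <- 1 / cv^2
  list(shape = shape, rate = shape / mn)
}

gamma_from_mean_cv(0.10, 1.00)   # mu prior in the first fit: shape 1, rate 10
gamma_from_mean_cv(0.10, 0.50)   # mu prior in this fit:      shape 4, rate 40
```

Halving the CV quadruples the shape parameter, so this prior is considerably more informative about the dropout rate.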
Show code
stan_modelname <- "pnbd_onlineretail_fixed2"
stanfit_prefix <- str_c("fit_", stan_modelname)
stanfit_seed <- stanfit_seed + 1
stan_data_lst <- customer_fit_stats_tbl |>
select(customer_id, x, t_x, T_cal) |>
compose_data(
lambda_mn = 0.25,
lambda_cv = 0.50,
mu_mn = 0.10,
mu_cv = 0.50
)
pnbd_onlineretail_fixed2_stanfit <- pnbd_fixed_stanmodel$sample(
data = stan_data_lst,
chains = 4,
iter_warmup = 500,
iter_sampling = 500,
seed = stanfit_seed,
save_warmup = TRUE,
output_dir = stan_modeldir,
output_basename = stanfit_prefix
)
Running MCMC with 4 chains, at most 8 in parallel...
Chain 1 finished in 70.5 seconds.
Chain 2 finished in 71.5 seconds.
Chain 3 finished in 71.4 seconds.
Chain 4 finished in 70.6 seconds.
All 4 chains finished successfully.
Mean chain execution time: 71.0 seconds.
Total execution time: 72.0 seconds.
Show code
pnbd_onlineretail_fixed2_stanfit$summary()
# A tibble: 13,009 × 10
variable mean median sd mad q5 q95 rhat ess_bulk
<chr> <num> <num> <num> <num> <num> <num> <num> <num>
1 lp__ -1.53e+5 -1.53e+5 65.3 64.5 -1.53e+5 -1.53e+5 1.01 624.
2 lambda[1] 2.90e-1 2.84e-1 0.0692 0.0665 1.83e-1 4.10e-1 1.00 2937.
3 lambda[2] 2.06e-1 1.92e-1 0.102 0.0957 6.82e-2 3.93e-1 1.00 2141.
4 lambda[3] 2.05e-1 1.84e-1 0.109 0.102 6.49e-2 4.14e-1 1.00 2705.
5 lambda[4] 1.04e-1 1.00e-1 0.0389 0.0366 4.84e-2 1.78e-1 1.00 2984.
6 lambda[5] 2.44e-1 2.27e-1 0.118 0.112 8.41e-2 4.65e-1 1.00 2542.
7 lambda[6] 2.68e-1 2.47e-1 0.124 0.114 1.06e-1 4.99e-1 1.00 2804.
8 lambda[7] 2.09e-1 1.88e-1 0.109 0.101 6.61e-2 4.11e-1 0.999 2163.
9 lambda[8] 2.11e-1 1.92e-1 0.111 0.0981 6.43e-2 4.15e-1 1.01 2793.
10 lambda[9] 2.59e-1 2.45e-1 0.106 0.104 1.11e-1 4.47e-1 1.00 3605.
# ℹ 12,999 more rows
# ℹ 1 more variable: ess_tail <num>
We have some basic HMC-based validity statistics we can check.
Show code
pnbd_onlineretail_fixed2_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed2-1.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed2-2.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed2-3.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed2-4.csvWarning: non-fatal error reading adaptation data
Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.
Checking sampler transitions for divergences.
No divergent transitions found.
Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.
Effective sample size satisfactory.
Split R-hat values satisfactory all parameters.
Processing complete, no problems detected.
3.1 Visual Diagnostics of the Sample Validity
Now that we have a sample from the posterior distribution we need to create a few different visualisations of the diagnostics.
Show code
parameter_subset <- c(
"lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
"mu[1]", "mu[2]", "mu[3]", "mu[4]"
)
pnbd_onlineretail_fixed2_stanfit$draws(inc_warmup = FALSE) |>
mcmc_trace(pars = parameter_subset) +
expand_limits(y = 0) +
labs(
x = "Iteration",
y = "Value",
title = "Traceplot of Sample of Lambda and Mu Values"
) +
theme(axis.text.x = element_text(size = 10))
We want to check the \(N_{eff}\) statistics also.
Show code
pnbd_onlineretail_fixed2_stanfit |>
neff_ratio(pars = c("lambda", "mu")) |>
as.numeric() |>
mcmc_neff() +
ggtitle("Plot of Parameter Effective Sample Sizes")
3.2 Assess the Model
As we intend to run the same logic to assess each of our models, we have combined all this logic into a single function run_model_assessment, to run the simulations and combine the datasets.
3.2.1 Check In-Sample Data Validation
We first check the model against the in-sample data.
Show code
assess_data_lst <- run_model_assessment(
model_stanfit = pnbd_onlineretail_fixed2_stanfit,
insample_tbl = customer_fit_stats_tbl,
outsample_tbl = customer_valid_stats_tbl,
fit_label = "pnbd_onlineretail_fixed2",
fit_end_dttm = use_fit_end_date |> as.POSIXct(),
valid_start_dttm = use_valid_start_date |> as.POSIXct(),
valid_end_dttm = use_valid_end_date |> as.POSIXct(),
sim_seed = 20
)
insample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_fitdata_tbl,
simdata_tbl = assess_data_lst$model_fit_simstats_tbl
)
insample_plots_lst$multi_plot |> print()
Show code
insample_plots_lst$total_plot |> print()
Show code
insample_plots_lst$quant_plot |> print()
This fit looks reasonable and appears to capture most of the aspects of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but it at least confirms that our model is valid.
3.2.2 Check Out-of-Sample Data Validation
We now repeat for the out-of-sample data.
Show code
outsample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_validdata_tbl,
simdata_tbl = assess_data_lst$model_valid_simstats_tbl
)
outsample_plots_lst$multi_plot |> print()
Show code
outsample_plots_lst$total_plot |> print()
Show code
outsample_plots_lst$quant_plot |> print()
3.3 Write Assess Data to Disk
We now store this assessment data to disk.
Show code
assess_data_lst |> write_rds("data/pnbd_onlineretail_fixed2_assess_data_lst.rds")
4 Fit Tight-Lifetime Model
We now want to try a model with a tighter coefficient of variation on the lifetime (dropout rate) prior while keeping the transaction-frequency prior unchanged.
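Since customer lifetimes are exponential with rate \(\mu\), the expected lifetime given \(\mu\) is \(1/\mu\) weeks. We can inspect what the tighter prior implies by comparing prior quantiles of \(\mu\) under both coefficients of variation, again assuming Gamma priors with shape \(1/\text{CV}^2\) and mean 0.10:

```r
# Prior quantiles of the dropout rate mu under two coefficients of variation,
# assuming Gamma priors (shape = 1/cv^2, rate = shape/mean).
mu_quantiles <- function(mn, cv, probs = c(0.05, 0.50, 0.95)) {
  shape <- 1 / cv^2
  qgamma(probs, shape = shape, rate = shape / mn)
}

mu_quantiles(0.10, 1.00)   # wide prior:    shape 1, rate 10 (i.e. exponential)
mu_quantiles(0.10, 0.50)   # tighter prior: shape 4, rate 40
```

The cv = 0.50 prior rules out both very small and very large dropout rates, concentrating the implied mean lifetimes \(1/\mu\) more tightly around 10 weeks.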
Show code
stan_modelname <- "pnbd_onlineretail_fixed3"
stanfit_prefix <- str_c("fit_", stan_modelname)
stanfit_seed <- stanfit_seed + 1
stan_data_lst <- customer_fit_stats_tbl |>
select(customer_id, x, t_x, T_cal) |>
compose_data(
lambda_mn = 0.25,
lambda_cv = 1.00,
mu_mn = 0.10,
mu_cv = 0.50
)
pnbd_onlineretail_fixed3_stanfit <- pnbd_fixed_stanmodel$sample(
data = stan_data_lst,
chains = 4,
iter_warmup = 500,
iter_sampling = 500,
seed = stanfit_seed,
save_warmup = TRUE,
output_dir = stan_modeldir,
output_basename = stanfit_prefix
)
Running MCMC with 4 chains, at most 8 in parallel...
Chain 1 finished in 96.6 seconds.
Chain 2 finished in 95.7 seconds.
Chain 3 finished in 93.9 seconds.
Chain 4 finished in 91.6 seconds.
All 4 chains finished successfully.
Mean chain execution time: 94.5 seconds.
Total execution time: 97.6 seconds.
Show code
pnbd_onlineretail_fixed3_stanfit$summary()
# A tibble: 13,009 × 10
variable mean median sd mad q5 q95 rhat ess_bulk
<chr> <num> <num> <num> <num> <num> <num> <num> <num>
1 lp__ -1.20e+5 -1.20e+5 70.1 68.2 -1.20e+5 -1.20e+5 1.00 644.
2 lambda[1] 2.99e-1 2.92e-1 0.0806 0.0752 1.81e-1 4.48e-1 1.00 3475.
3 lambda[2] 1.47e-1 9.93e-2 0.150 0.100 6.88e-3 4.47e-1 1.00 2245.
4 lambda[3] 1.33e-1 8.35e-2 0.151 0.0888 5.57e-3 4.14e-1 1.00 2236.
5 lambda[4] 7.25e-2 6.78e-2 0.0349 0.0340 2.61e-2 1.35e-1 1.01 2864.
6 lambda[5] 2.33e-1 1.54e-1 0.234 0.157 1.49e-2 6.91e-1 1.00 2427.
7 lambda[6] 3.02e-1 2.54e-1 0.212 0.187 5.48e-2 7.07e-1 1.00 2564.
8 lambda[7] 1.48e-1 9.77e-2 0.169 0.103 5.96e-3 4.47e-1 1.00 2720.
9 lambda[8] 1.45e-1 8.12e-2 0.172 0.0909 6.48e-3 5.03e-1 0.999 3156.
10 lambda[9] 2.64e-1 2.31e-1 0.151 0.138 7.14e-2 5.54e-1 1.00 3361.
# ℹ 12,999 more rows
# ℹ 1 more variable: ess_tail <num>
We have some basic HMC-based validity statistics we can check.
Show code
pnbd_onlineretail_fixed3_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed3-1.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed3-2.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed3-3.csvWarning: non-fatal error reading adaptation data
, /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed3-4.csvWarning: non-fatal error reading adaptation data
Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.
Checking sampler transitions for divergences.
No divergent transitions found.
Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.
Effective sample size satisfactory.
Split R-hat values satisfactory all parameters.
Processing complete, no problems detected.
4.1 Visual Diagnostics of the Sample Validity
Now that we have a sample from the posterior distribution we need to create a few different visualisations of the diagnostics.
Show code
parameter_subset <- c(
"lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
"mu[1]", "mu[2]", "mu[3]", "mu[4]"
)
pnbd_onlineretail_fixed3_stanfit$draws(inc_warmup = FALSE) |>
mcmc_trace(pars = parameter_subset) +
expand_limits(y = 0) +
labs(
x = "Iteration",
y = "Value",
title = "Traceplot of Sample of Lambda and Mu Values"
) +
theme(axis.text.x = element_text(size = 10))
We want to check the \(N_{eff}\) statistics also.
Show code
pnbd_onlineretail_fixed3_stanfit |>
neff_ratio(pars = c("lambda", "mu")) |>
as.numeric() |>
mcmc_neff() +
ggtitle("Plot of Parameter Effective Sample Sizes")
4.2 Assess the Model
As we intend to run the same logic to assess each of our models, we have combined all this logic into a single function run_model_assessment, to run the simulations and combine the datasets.
4.2.1 Check In-Sample Data Validation
We first check the model against the in-sample data.
Show code
assess_data_lst <- run_model_assessment(
model_stanfit = pnbd_onlineretail_fixed3_stanfit,
insample_tbl = customer_fit_stats_tbl,
outsample_tbl = customer_valid_stats_tbl,
fit_label = "pnbd_onlineretail_fixed3",
fit_end_dttm = use_fit_end_date |> as.POSIXct(),
valid_start_dttm = use_valid_start_date |> as.POSIXct(),
valid_end_dttm = use_valid_end_date |> as.POSIXct(),
sim_seed = 30
)
insample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_fitdata_tbl,
simdata_tbl = assess_data_lst$model_fit_simstats_tbl
)
insample_plots_lst$multi_plot |> print()
Show code
insample_plots_lst$total_plot |> print()
Show code
insample_plots_lst$quant_plot |> print()
This fit looks reasonable and appears to capture most of the aspects of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but it at least confirms that our model is valid.
4.2.2 Check Out-of-Sample Data Validation
We now repeat for the out-of-sample data.
Show code
outsample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_validdata_tbl,
simdata_tbl = assess_data_lst$model_valid_simstats_tbl
)
outsample_plots_lst$multi_plot |> print()
Show code
outsample_plots_lst$total_plot |> print()
Show code
outsample_plots_lst$quant_plot |> print()
4.3 Write Assess Data to Disk
We now store this assessment data to disk.
Show code
assess_data_lst |> write_rds("data/pnbd_onlineretail_fixed3_assess_data_lst.rds")
5 Fit Narrow-Short-Lifetime Model
We now want to try a model whose priors imply a shorter mean lifetime (a higher mean dropout rate) with an even narrower coefficient of variation, again keeping the transaction-frequency prior unchanged.
Show code
stan_modelname <- "pnbd_onlineretail_fixed4"
stanfit_prefix <- str_c("fit_", stan_modelname)
stanfit_seed <- stanfit_seed + 1
stan_data_lst <- customer_fit_stats_tbl |>
select(customer_id, x, t_x, T_cal) |>
compose_data(
lambda_mn = 0.25,
lambda_cv = 1.00,
mu_mn = 0.20,
mu_cv = 0.30
)
pnbd_onlineretail_fixed4_stanfit <- pnbd_fixed_stanmodel$sample(
data = stan_data_lst,
chains = 4,
iter_warmup = 500,
iter_sampling = 500,
seed = stanfit_seed,
save_warmup = TRUE,
output_dir = stan_modeldir,
output_basename = stanfit_prefix
)
Running MCMC with 4 chains, at most 8 in parallel...
Chain 1 finished in 123.8 seconds.
Chain 2 finished in 138.3 seconds.
Chain 3 finished in 127.3 seconds.
Chain 4 finished in 131.2 seconds.
All 4 chains finished successfully.
Mean chain execution time: 130.2 seconds.
Total execution time: 139.1 seconds.
Show code
pnbd_onlineretail_fixed4_stanfit$summary()
# A tibble: 13,009 × 10
variable mean median sd mad q5 q95 rhat ess_bulk
<chr> <num> <num> <num> <num> <num> <num> <num> <num>
1 lp__ -1.94e+5 -1.94e+5 67.9 71.2 -1.94e+5 -1.94e+5 1.00 740.
2 lambda[1] 3.06e-1 2.98e-1 0.0836 0.0787 1.86e-1 4.57e-1 1.01 2955.
3 lambda[2] 1.65e-1 1.01e-1 0.191 0.111 6.70e-3 5.45e-1 1.00 2520.
4 lambda[3] 1.62e-1 1.00e-1 0.181 0.109 6.94e-3 5.22e-1 1.00 1960.
5 lambda[4] 7.38e-2 6.80e-2 0.0350 0.0325 2.63e-2 1.40e-1 1.00 3693.
6 lambda[5] 2.31e-1 1.65e-1 0.217 0.161 1.55e-2 6.66e-1 1.00 1839.
7 lambda[6] 3.02e-1 2.53e-1 0.210 0.179 5.78e-2 7.10e-1 1.00 2295.
8 lambda[7] 1.68e-1 1.12e-1 0.184 0.118 8.23e-3 4.95e-1 1.00 2408.
9 lambda[8] 1.60e-1 1.02e-1 0.173 0.108 8.55e-3 5.09e-1 1.00 2553.
10 lambda[9] 2.71e-1 2.45e-1 0.157 0.140 7.15e-2 5.69e-1 1.00 2722.
# ℹ 12,999 more rows
# ℹ 1 more variable: ess_tail <num>
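To make these parameters concrete: each `lambda[i]` is a customer's latent purchase rate and each `mu[i]` their latent dropout rate. Conditional on those individual-level rates, the Pareto/NBD probability that a customer is still active at the end of a calibration window of length \(T\), with last purchase at \(t_x\), has the closed form \(P(\text{alive}) = \left(1 + \frac{\mu}{\lambda + \mu}\left(e^{(\lambda + \mu)(T - t_x)} - 1\right)\right)^{-1}\). A minimal sketch with hypothetical values (the rates, times, and time unit below are illustrative, not taken from our fit):

```r
# Pareto/NBD probability a customer is still "alive" at the end of the
# calibration window, conditional on their individual purchase rate (lambda)
# and dropout rate (mu). t_x is the time of the last observed purchase and
# T_cal the length of the calibration window (same time unit throughout).
p_alive <- function(lambda, mu, t_x, T_cal) {
  1 / (1 + (mu / (lambda + mu)) * (exp((lambda + mu) * (T_cal - t_x)) - 1))
}

# Hypothetical customer: buys ~0.3 times per period, dropout rate 0.05,
# last purchase at t = 40 in a 52-period window.
p_alive(lambda = 0.3, mu = 0.05, t_x = 40, T_cal = 52)

# A customer whose last purchase is at the very end of the window is
# alive with probability 1.
p_alive(lambda = 0.3, mu = 0.05, t_x = 52, T_cal = 52)
```

In practice we would evaluate this over the posterior draws of `lambda[i]` and `mu[i]` rather than point values, giving a posterior distribution of the alive-probability per customer.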
We have some basic HMC-based validity statistics we can check.
Show code
pnbd_onlineretail_fixed4_stanfit$cmdstan_diagnose()
Processing csv files: /home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed4-1.csv
Warning: non-fatal error reading adaptation data
/home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed4-2.csv
Warning: non-fatal error reading adaptation data
/home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed4-3.csv
Warning: non-fatal error reading adaptation data
/home/rstudio/btydwork/stan_models/fit_pnbd_onlineretail_fixed4-4.csv
Warning: non-fatal error reading adaptation data
Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.
Checking sampler transitions for divergences.
No divergent transitions found.
Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.
Effective sample size satisfactory.
Split R-hat values satisfactory all parameters.
Processing complete, no problems detected.
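It is worth being clear about what the R-hat check above is actually measuring: it compares the between-chain and within-chain variance of the draws, and is close to 1 only when the chains agree. A self-contained illustration on simulated draws (this is the classic Gelman-Rubin statistic, not the split version CmdStan reports, and none of these objects come from our fit):

```r
# Classic Gelman-Rubin potential scale reduction for a matrix of draws
# with one column per chain (iterations x chains).
basic_rhat <- function(draws) {
  n <- nrow(draws)
  chain_means <- colMeans(draws)
  b <- n * var(chain_means)           # between-chain variance estimate
  w <- mean(apply(draws, 2, var))     # within-chain variance estimate
  var_plus <- (n - 1) / n * w + b / n # pooled posterior variance estimate
  sqrt(var_plus / w)
}

set.seed(42)

# Four well-mixed "chains" of independent draws: R-hat close to 1.
mixed <- matrix(rnorm(4 * 250), nrow = 250, ncol = 4)
basic_rhat(mixed)

# Shift one chain away from the others: R-hat well above 1,
# flagging that the chains have not converged to the same distribution.
stuck <- mixed + rep(c(0, 0, 0, 5), each = 250)
basic_rhat(stuck)
```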
5.1 Visual Diagnostics of the Sample Validity
Now that we have a sample from the posterior distribution, we create a few visualisations to check its validity.
Show code
parameter_subset <- c(
"lambda[1]", "lambda[2]", "lambda[3]", "lambda[4]",
"mu[1]", "mu[2]", "mu[3]", "mu[4]"
)
pnbd_onlineretail_fixed4_stanfit$draws(inc_warmup = FALSE) |>
mcmc_trace(pars = parameter_subset) +
expand_limits(y = 0) +
labs(
x = "Iteration",
y = "Value",
title = "Traceplot of Sample of Lambda and Mu Values"
) +
  theme(axis.text.x = element_text(size = 10))
We also want to check the \(N_{eff}\) statistics.
Show code
pnbd_onlineretail_fixed4_stanfit |>
neff_ratio(pars = c("lambda", "mu")) |>
as.numeric() |>
mcmc_neff() +
  ggtitle("Plot of Parameter Effective Sample Sizes")
5.2 Assess the Model
As we intend to assess each of our models in the same way, we have combined this logic into a single function, run_model_assessment, which runs the simulations and combines the resulting datasets.
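The real run_model_assessment works on the full Stan fit, but the core idea can be sketched in a self-contained way: for each posterior draw of \((\lambda, \mu)\), simulate every customer's transaction count over the window, then reduce each simulated dataset to the same summary statistics we compute on the observed data. All names, sizes, and parameter values below are illustrative, not taken from the real function:

```r
set.seed(40)

n_draws <- 100  # posterior draws (faked here with gamma variates)
n_cust  <- 5    # customers
t_valid <- 52   # length of the simulation window

lambda_draws <- matrix(rgamma(n_draws * n_cust, 2, 10), n_draws, n_cust)
mu_draws     <- matrix(rgamma(n_draws * n_cust, 1, 20), n_draws, n_cust)

# P/NBD mechanics: each customer stays alive for an Exp(mu) lifetime
# (truncated at the window end) and buys at Poisson rate lambda while alive.
sim_counts <- function(lambda, mu, t_window) {
  alive_for <- pmin(rexp(length(lambda), mu), t_window)
  rpois(length(lambda), lambda * alive_for)
}

# One row of summary statistics per posterior draw.
simstats <- t(sapply(seq_len(n_draws), function(d) {
  counts <- sim_counts(lambda_draws[d, ], mu_draws[d, ], t_valid)
  c(total = sum(counts), mean = mean(counts))
}))

head(simstats)
```

Comparing the distribution of these simulated statistics against the observed values is exactly the posterior-predictive check the assessment plots below visualise.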
5.2.1 Check In-Sample Data Validation
We first check the model against the in-sample data.
Show code
assess_data_lst <- run_model_assessment(
model_stanfit = pnbd_onlineretail_fixed4_stanfit,
insample_tbl = customer_fit_stats_tbl,
outsample_tbl = customer_valid_stats_tbl,
fit_label = "pnbd_onlineretail_fixed4",
fit_end_dttm = use_fit_end_date |> as.POSIXct(),
valid_start_dttm = use_valid_start_date |> as.POSIXct(),
valid_end_dttm = use_valid_end_date |> as.POSIXct(),
sim_seed = 40
)
insample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_fitdata_tbl,
simdata_tbl = assess_data_lst$model_fit_simstats_tbl
)
insample_plots_lst$multi_plot |> print()
Show code
insample_plots_lst$total_plot |> print()
Show code
insample_plots_lst$quant_plot |> print()
This fit looks reasonable and captures most features of the data used to fit it. Given that this is a synthetic dataset, this is not surprising, but it does at least give us some reassurance that our model is valid.
5.2.2 Check Out-of-Sample Data Validation
We now repeat for the out-of-sample data.
Show code
outsample_plots_lst <- create_model_assessment_plots(
obsdata_tbl = obs_validdata_tbl,
simdata_tbl = assess_data_lst$model_valid_simstats_tbl
)
outsample_plots_lst$multi_plot |> print()
Show code
outsample_plots_lst$total_plot |> print()
Show code
outsample_plots_lst$quant_plot |> print()
5.3 Write Assess Data to Disk
We now store this assessment data to disk.
Show code
assess_data_lst |> write_rds("data/pnbd_onlineretail_fixed4_assess_data_lst.rds")
6 R Environment
Show code
options(width = 120L)
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.2.3 (2023-03-15)
os Ubuntu 22.04.2 LTS
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Dublin
date 2023-05-12
pandoc 2.19.2 @ /usr/lib/rstudio-server/bin/quarto/bin/tools/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
abind 1.4-5 2016-07-21 [1] RSPM (R 4.2.0)
arrayhelpers 1.1-0 2020-02-04 [1] RSPM (R 4.2.0)
backports 1.4.1 2021-12-13 [1] RSPM (R 4.2.0)
base64enc 0.1-3 2015-07-28 [1] RSPM (R 4.2.0)
bayesplot * 1.10.0 2022-11-16 [1] RSPM (R 4.2.0)
boot 1.3-28.1 2022-11-22 [2] CRAN (R 4.2.3)
bridgesampling 1.1-2 2021-04-16 [1] RSPM (R 4.2.0)
brms * 2.19.0 2023-03-14 [1] RSPM (R 4.2.0)
Brobdingnag 1.2-9 2022-10-19 [1] RSPM (R 4.2.0)
cachem 1.0.7 2023-02-24 [1] RSPM (R 4.2.0)
callr 3.7.3 2022-11-02 [1] RSPM (R 4.2.0)
checkmate 2.1.0 2022-04-21 [1] RSPM (R 4.2.0)
cli 3.6.1 2023-03-23 [1] RSPM (R 4.2.0)
cmdstanr * 0.5.3 2023-05-09 [1] Github (stan-dev/cmdstanr@22b391e)
coda 0.19-4 2020-09-30 [1] RSPM (R 4.2.0)
codetools 0.2-19 2023-02-01 [2] CRAN (R 4.2.3)
colorspace 2.1-0 2023-01-23 [1] RSPM (R 4.2.0)
colourpicker 1.2.0 2022-10-28 [1] RSPM (R 4.2.0)
conflicted * 1.2.0 2023-02-01 [1] RSPM (R 4.2.0)
cowplot * 1.1.1 2020-12-30 [1] RSPM (R 4.2.0)
crayon 1.5.2 2022-09-29 [1] RSPM (R 4.2.0)
crosstalk 1.2.0 2021-11-04 [1] RSPM (R 4.2.0)
digest 0.6.31 2022-12-11 [1] RSPM (R 4.2.0)
directlabels * 2021.1.13 2021-01-16 [1] RSPM (R 4.2.0)
distributional 0.3.2 2023-03-22 [1] RSPM (R 4.2.0)
dplyr * 1.1.1 2023-03-22 [1] RSPM (R 4.2.0)
DT 0.27 2023-01-17 [1] RSPM (R 4.2.0)
dygraphs 1.1.1.6 2018-07-11 [1] RSPM (R 4.2.0)
ellipsis 0.3.2 2021-04-29 [1] RSPM (R 4.2.0)
evaluate 0.20 2023-01-17 [1] RSPM (R 4.2.0)
fansi 1.0.4 2023-01-22 [1] RSPM (R 4.2.0)
farver 2.1.1 2022-07-06 [1] RSPM (R 4.2.0)
fastmap 1.1.1 2023-02-24 [1] RSPM (R 4.2.0)
forcats * 1.0.0 2023-01-29 [1] RSPM (R 4.2.0)
fs * 1.6.1 2023-02-06 [1] RSPM (R 4.2.0)
furrr * 0.3.1 2022-08-15 [1] RSPM (R 4.2.0)
future * 1.32.0 2023-03-07 [1] RSPM (R 4.2.0)
gamm4 0.2-6 2020-04-03 [1] RSPM (R 4.2.0)
generics 0.1.3 2022-07-05 [1] RSPM (R 4.2.0)
ggdist 3.2.1 2023-01-18 [1] RSPM (R 4.2.0)
ggplot2 * 3.4.2 2023-04-03 [1] RSPM (R 4.2.0)
globals 0.16.2 2022-11-21 [1] RSPM (R 4.2.0)
glue * 1.6.2 2022-02-24 [1] RSPM (R 4.2.0)
gridExtra 2.3 2017-09-09 [1] RSPM (R 4.2.0)
gtable 0.3.3 2023-03-21 [1] RSPM (R 4.2.0)
gtools 3.9.4 2022-11-27 [1] RSPM (R 4.2.0)
hms 1.1.3 2023-03-21 [1] RSPM (R 4.2.0)
htmltools 0.5.5 2023-03-23 [1] RSPM (R 4.2.0)
htmlwidgets 1.6.2 2023-03-17 [1] RSPM (R 4.2.0)
httpuv 1.6.9 2023-02-14 [1] RSPM (R 4.2.0)
igraph 1.4.2 2023-04-07 [1] RSPM (R 4.2.0)
inline 0.3.19 2021-05-31 [1] RSPM (R 4.2.0)
jsonlite 1.8.4 2022-12-06 [1] RSPM (R 4.2.0)
knitr 1.42 2023-01-25 [1] RSPM (R 4.2.0)
labeling 0.4.2 2020-10-20 [1] RSPM (R 4.2.0)
later 1.3.0 2021-08-18 [1] RSPM (R 4.2.0)
lattice 0.20-45 2021-09-22 [2] CRAN (R 4.2.3)
lifecycle 1.0.3 2022-10-07 [1] RSPM (R 4.2.0)
listenv 0.9.0 2022-12-16 [1] RSPM (R 4.2.0)
lme4 1.1-32 2023-03-14 [1] RSPM (R 4.2.0)
loo 2.6.0 2023-03-31 [1] RSPM (R 4.2.0)
lubridate * 1.9.2 2023-02-10 [1] RSPM (R 4.2.0)
magrittr * 2.0.3 2022-03-30 [1] RSPM (R 4.2.0)
markdown 1.6 2023-04-07 [1] RSPM (R 4.2.0)
MASS 7.3-58.2 2023-01-23 [2] CRAN (R 4.2.3)
Matrix 1.5-3 2022-11-11 [2] CRAN (R 4.2.3)
matrixStats 0.63.0 2022-11-18 [1] RSPM (R 4.2.0)
memoise 2.0.1 2021-11-26 [1] RSPM (R 4.2.0)
mgcv 1.8-42 2023-03-02 [2] CRAN (R 4.2.3)
mime 0.12 2021-09-28 [1] RSPM (R 4.2.0)
miniUI 0.1.1.1 2018-05-18 [1] RSPM (R 4.2.0)
minqa 1.2.5 2022-10-19 [1] RSPM (R 4.2.0)
munsell 0.5.0 2018-06-12 [1] RSPM (R 4.2.0)
mvtnorm 1.1-3 2021-10-08 [1] RSPM (R 4.2.0)
nlme 3.1-162 2023-01-31 [2] CRAN (R 4.2.3)
nloptr 2.0.3 2022-05-26 [1] RSPM (R 4.2.0)
parallelly 1.35.0 2023-03-23 [1] RSPM (R 4.2.0)
pillar 1.9.0 2023-03-22 [1] RSPM (R 4.2.0)
pkgbuild 1.4.0 2022-11-27 [1] RSPM (R 4.2.0)
pkgconfig 2.0.3 2019-09-22 [1] RSPM (R 4.2.0)
plyr 1.8.8 2022-11-11 [1] RSPM (R 4.2.0)
posterior * 1.4.1 2023-03-14 [1] RSPM (R 4.2.0)
prettyunits 1.1.1 2020-01-24 [1] RSPM (R 4.2.0)
processx 3.8.1 2023-04-18 [1] RSPM (R 4.2.0)
projpred 2.5.0 2023-04-05 [1] RSPM (R 4.2.0)
promises 1.2.0.1 2021-02-11 [1] RSPM (R 4.2.0)
ps 1.7.5 2023-04-18 [1] RSPM (R 4.2.0)
purrr * 1.0.1 2023-01-10 [1] RSPM (R 4.2.0)
quadprog 1.5-8 2019-11-20 [1] RSPM (R 4.2.0)
R6 2.5.1 2021-08-19 [1] RSPM (R 4.2.0)
Rcpp * 1.0.10 2023-01-22 [1] RSPM (R 4.2.0)
RcppParallel 5.1.7 2023-02-27 [1] RSPM (R 4.2.0)
readr * 2.1.4 2023-02-10 [1] RSPM (R 4.2.0)
reshape2 1.4.4 2020-04-09 [1] RSPM (R 4.2.0)
rlang * 1.1.0 2023-03-14 [1] RSPM (R 4.2.0)
rmarkdown 2.21 2023-03-26 [1] RSPM (R 4.2.0)
rstan 2.21.8 2023-01-17 [1] RSPM (R 4.2.0)
rstantools 2.3.1 2023-03-30 [1] RSPM (R 4.2.0)
rstudioapi 0.14 2022-08-22 [1] RSPM (R 4.2.0)
rsyslog * 1.0.2 2021-06-04 [1] RSPM (R 4.2.0)
scales * 1.2.1 2022-08-20 [1] RSPM (R 4.2.0)
sessioninfo 1.2.2 2021-12-06 [1] RSPM (R 4.2.0)
shiny 1.7.4 2022-12-15 [1] RSPM (R 4.2.0)
shinyjs 2.1.0 2021-12-23 [1] RSPM (R 4.2.0)
shinystan 2.6.0 2022-03-03 [1] RSPM (R 4.2.0)
shinythemes 1.2.0 2021-01-25 [1] RSPM (R 4.2.0)
StanHeaders 2.21.0-7 2020-12-17 [1] RSPM (R 4.2.0)
stringi 1.7.12 2023-01-11 [1] RSPM (R 4.2.0)
stringr * 1.5.0 2022-12-02 [1] RSPM (R 4.2.0)
svUnit 1.0.6 2021-04-19 [1] RSPM (R 4.2.0)
tensorA 0.36.2 2020-11-19 [1] RSPM (R 4.2.0)
threejs 0.3.3 2020-01-21 [1] RSPM (R 4.2.0)
tibble * 3.2.1 2023-03-20 [1] RSPM (R 4.2.0)
tidybayes * 3.0.4 2023-03-14 [1] RSPM (R 4.2.0)
tidyr * 1.3.0 2023-01-24 [1] RSPM (R 4.2.0)
tidyselect 1.2.0 2022-10-10 [1] RSPM (R 4.2.0)
tidyverse * 2.0.0 2023-02-22 [1] RSPM (R 4.2.0)
timechange 0.2.0 2023-01-11 [1] RSPM (R 4.2.0)
tzdb 0.3.0 2022-03-28 [1] RSPM (R 4.2.0)
utf8 1.2.3 2023-01-31 [1] RSPM (R 4.2.0)
vctrs 0.6.2 2023-04-19 [1] RSPM (R 4.2.0)
withr 2.5.0 2022-03-03 [1] RSPM (R 4.2.0)
xfun 0.38 2023-03-24 [1] RSPM (R 4.2.0)
xtable 1.8-4 2019-04-21 [1] RSPM (R 4.2.0)
xts 0.13.1 2023-04-16 [1] RSPM (R 4.2.0)
yaml 2.3.7 2023-01-23 [1] RSPM (R 4.2.0)
zoo 1.8-12 2023-04-13 [1] RSPM (R 4.2.0)
[1] /usr/local/lib/R/site-library
[2] /usr/local/lib/R/library
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Show code
options(width = 80L)